9 research outputs found

    ArAutoSenti: Automatic annotation and new tendencies for sentiment classification of Arabic messages

    The file attached to this record is the author's final peer reviewed version. A corpus-based sentiment analysis approach for messages written in Arabic and its dialects is presented and implemented. The originality of this approach lies in the automatic construction of the annotated sentiment corpus, which relies mainly on a sentiment lexicon that is also constructed automatically. For the classification step, shallow and deep classifiers are used, with features extracted using word embedding models. To validate the constructed corpus, it was reviewed manually, and 85.17% of it was found to be correctly annotated. The approach is applied to the under-resourced Algerian dialect and tested on two external test corpora presented in the literature. The obtained results are very encouraging, with an F1-score of up to 88% on the first test corpus and up to 81% on the second. These results represent a 20% and a 6% improvement, respectively, when compared with existing work in the research literature.
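    As a rough illustration of the lexicon-based automatic annotation this abstract describes, the sketch below labels a message by summing word polarities from a small lexicon. The English toy lexicon, the function name `annotate`, and the rule of skipping messages with a zero score are all assumptions made for illustration; they are not taken from the paper.

```python
# Hypothetical toy lexicon: word -> polarity score (illustrative only).
LEXICON = {"good": 1, "great": 1, "bad": -1, "awful": -1}


def annotate(message, lexicon=LEXICON):
    """Return 'positive', 'negative', or None when the score is zero."""
    # Sum the polarity of every known token; unknown tokens count as 0.
    score = sum(lexicon.get(token, 0) for token in message.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None  # no evidence either way, so the message stays unlabeled


messages = ["great service", "awful and bad", "no opinion here"]

# Keep only the messages that received a label (assumed filtering rule).
corpus = [(m, annotate(m)) for m in messages if annotate(m) is not None]
```

    A corpus built this way could then be used as training data for a classifier, which is the role the automatically annotated corpus plays in the abstract.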

    Building an effective and efficient background knowledge resource to enhance ontology matching

    Ontology matching is critical for data integration and interoperability. Original ontology matching approaches relied solely on the content of the ontologies to align. However, these approaches are less effective when equivalent concepts have dissimilar labels and are structured with different modeling views. To overcome this semantic heterogeneity, the community has turned to the use of external background knowledge resources. Several methods have been proposed to select ontologies, other than the ones to align, as background knowledge to enhance a given ontology-matching task. However, these methods return a set of complete ontologies, while, in most cases, only fragments of the returned ontologies are effective for discovering new mappings. In this article, we propose an approach to select and build a background knowledge resource with just the right concepts chosen from a set of ontologies, which improves efficiency without loss of effectiveness. The use of background knowledge in ontology matching is a double-edged sword: while it may increase recall (i.e., retrieve more correct mappings), it may lower precision (i.e., produce more incorrect mappings). Therefore, we propose two methods to select the most relevant mappings from the candidate ones: (1) a selection based on a set of rules and (2) a selection based on supervised machine learning. Our experiments, conducted on two Ontology Alignment Evaluation Initiative (OAEI) datasets, confirm the effectiveness and efficiency of our approach. Moreover, the F-measure values obtained with our approach are very competitive with those of the state-of-the-art matchers exploiting background knowledge resources.
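    The recall/precision trade-off mentioned in this abstract can be made concrete with a small sketch. The function below computes precision, recall, and F-measure for a set of candidate mappings against a reference alignment. The mapping pairs, the variable names, and the function name `precision_recall_f1` are hypothetical and chosen only for illustration; they do not come from the article or from the OAEI datasets.

```python
def precision_recall_f1(found, reference):
    """Compute precision, recall and F-measure of `found` against `reference`."""
    found, reference = set(found), set(reference)
    correct = len(found & reference)  # mappings that are in the reference
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical reference alignment between two ontologies.
reference = {("a", "x"), ("b", "y"), ("c", "z"), ("d", "w")}

# Direct matching: few mappings, all of them correct.
direct = {("a", "x"), ("b", "y")}

# With background knowledge: one more correct mapping, one incorrect.
with_bk = direct | {("c", "z"), ("c", "q")}

p_direct, r_direct, f_direct = precision_recall_f1(direct, reference)
p_bk, r_bk, f_bk = precision_recall_f1(with_bk, reference)
```

    In this toy case the extra mappings raise recall but lower precision, which is the situation a mapping selection step is meant to address.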

    A semi-supervised approach for sentiment analysis of Arab(ic+izi) messages: Application to the Algerian dialect

    In this paper, we propose a semi-supervised approach for sentiment analysis of Arabic and its dialects. This approach is based on a sentiment corpus, constructed automatically and reviewed manually by native speakers of the Algerian dialect. It consists of constructing and applying a set of deep learning algorithms to classify the sentiment of Arabic messages as positive or negative. It was applied to Facebook messages written in Modern Standard Arabic (MSA) as well as in the Algerian dialect (DALG, a low-resource dialect spoken by more than 40 million people), in both the Arabic and Arabizi scripts. To handle Arabizi, we consider both options: transliteration (largely used in the research literature for handling Arabizi) and translation (never used in the research literature for handling Arabizi). To highlight the effectiveness of a semi-supervised approach, we carried out different experiments using both corpora for training (i.e., the corpus constructed automatically and the one that was reviewed manually). The experiments were done on many test corpora dedicated to MSA/DALG, which were proposed and evaluated in the research literature. Both shallow and deep learning classifiers are used, such as Random Forest (RF), Logistic Regression (LR), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). These classifiers are combined with word embedding models such as Word2vec and fastText for sentiment classification. Experimental results (F1 score up to 95% for intrinsic experiments and up to 89% for extrinsic experiments) showed that the proposed system outperforms the existing state-of-the-art methodologies (the best improvement is up to 25%).

    Modèles et outils d'annotations pour une mémoire personnelle de l'enseignant

    Within the technology enhanced learning research area, this thesis aims at defining and proposing to teachers a computerized memory as a personal knowledge management tool. This memory is built from the annotations that the teacher has made on his pedagogical documents. The proposed memory extends the teacher's cognitive capacities by assisting him unobtrusively in the management of the knowledge that is necessary for the realization of his activities. Taking into account the specificities of the teaching activity (knowledge involved, context of the activity) in the memory models yields a memory that is both dedicated to the teacher and aware of the teaching context. Two versions of the tool were developed: a mobile version and a web version (implemented by the company Pentila) which can be integrated into an LCMS.

    Semantic Annotation Tools for Learning Material

    This paper aims at providing the specification for semantic annotation tools for e-learning. Starting from the specific requirements of annotating learning material, we categorize and evaluate the existing annotation tools, which are mainly general-purpose ones. We illustrate two research prototypes of annotation tools we developed, and evaluate to what extent these prototypes meet the specific requirements of annotating learning material.